13 research outputs found

    A unified censored normal regression model for qPCR differential gene expression analysis

    Reverse transcription quantitative polymerase chain reaction (RT-qPCR) is considered the gold standard for accurate, sensitive, and fast measurement of gene expression. Prior to downstream statistical analysis, RT-qPCR fluorescence amplification curves are summarized into a single value, the quantification cycle (Cq). When RT-qPCR does not reach the limit of detection, the Cq is labeled as undetermined. Current state-of-the-art qPCR data analysis pipelines acknowledge the importance of normalization for removing non-biological sample-to-sample variation in the Cq values. However, their strategies for handling undetermined Cq values are ad hoc. We show that popular methods for handling undetermined values can have a severe impact on the downstream differential expression analysis: they introduce considerable bias and suffer from lower precision. We propose a novel method that unites preprocessing and differential expression analysis in a single statistical model, which provides a rigorous way of handling undetermined Cq values. We compare our method with existing approaches in a simulation study and on published microRNA and mRNA gene expression datasets. We show that our method outperforms traditional RT-qPCR differential expression analysis pipelines in the presence of undetermined values, in terms of both accuracy and precision.
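
The idea of treating undetermined Cq values as censored observations, rather than imputing them, can be illustrated with a minimal sketch. This is not the unified model from the paper (which also covers normalization); it is a generic right-censored normal likelihood for a single target, and the function names, the fixed `sigma`, and the example numbers are all assumptions made for illustration.

```python
import math

def censored_loglik(mu, sigma, observed, n_censored, limit):
    """Log-likelihood of a normal sample right-censored at `limit`."""
    ll = 0.0
    for y in observed:
        z = (y - mu) / sigma
        ll += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    # Each undetermined value contributes P(Cq > limit) = 1 - Phi(z_lim).
    z_lim = (limit - mu) / sigma
    survival = 0.5 * math.erfc(z_lim / math.sqrt(2.0))
    ll += n_censored * math.log(max(survival, 1e-300))
    return ll

def censored_mean(observed, n_censored, limit, sigma, iters=200):
    """Maximise the log-likelihood over mu (sigma held fixed).

    The log-likelihood is concave in mu, so a ternary search suffices.
    Assumes every observed value is at or below `limit`.
    """
    lo = min(observed)
    hi = max(max(observed), limit) + 10.0 * sigma
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (censored_loglik(m1, sigma, observed, n_censored, limit)
                < censored_loglik(m2, sigma, observed, n_censored, limit)):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Hypothetical data: four detected Cq values, two undetermined at cycle 35.
observed_cq = [30.1, 31.4, 32.0, 33.2]
mle_mean = censored_mean(observed_cq, n_censored=2, limit=35.0, sigma=1.5)
naive_mean = (sum(observed_cq) + 2 * 35.0) / 6  # impute the limit itself
print(mle_mean, naive_mean)
```

In this sketch, imputing the detection limit pulls the estimated mean Cq below the censored-likelihood estimate, which is one way to see how ad hoc imputation could bias a downstream comparison.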

    Analysis of tiling array expression studies with flexible designs in Bioconductor (waveTiling)

    Background: Existing statistical methods for tiling array transcriptome data focus either on transcript discovery in one biological or experimental condition or on the detection of differential expression between two conditions. Increasingly often, however, biologists are interested in time-course studies, studies with more than two conditions, or even multiple-factor studies. As these studies are currently analyzed with traditional microarray analysis techniques, they do not exploit the genome-wide nature of tiling array data to its full potential. Results: We present an R Bioconductor package, waveTiling, which implements a wavelet-based model for analyzing transcriptome data and extends it towards more complex experimental designs. With waveTiling the user is able to discover (1) group-wise expressed regions, (2) differentially expressed regions between any two groups in single-factor studies and (3) differentially expressed regions in multifactorial designs. Moreover, for time-course experiments it is also possible to detect (4) linear time effects and (5) a circadian rhythm of transcripts. By considering the expression values of the individual tiling probes as a function of genomic position, effect regions can be detected regardless of existing annotation. Three case studies with different experimental set-ups illustrate the use and the flexibility of the model-based transcriptome analysis. Conclusions: The waveTiling package provides the user with a convenient tool for the analysis of tiling array transcriptome data for a multitude of experimental set-ups. Regardless of the study design, the probe-wise analysis allows for the detection of transcriptional effects in exonic, intronic and intergenic regions, without prior consultation of existing annotation.

    Calculating bivariate orthonormal polynomials by recurrence

    Emerson gave recurrence formulae for the calculation of orthonormal polynomials for univariate discrete random variables. He claimed that, as these were based on the Christoffel–Darboux recurrence relation, they were more efficient than those based on the Gram–Schmidt method. This approach was generalised by Rayner and colleagues to arbitrary univariate random variables, the only constraint being that the expectations needed are well defined. Here the approach is extended to arbitrary bivariate random variables for which the expectations needed are well defined. The extension to multivariate random variables is clear.
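
For orientation, the univariate discrete case mentioned above can be sketched with the standard three-term recurrence for orthonormal polynomials. This is a generic illustration, not Emerson's formulae as published and not the bivariate extension the abstract describes; the function name, argument layout, and example distribution are assumptions.

```python
import math

def orthonormal_polynomials(support, probs, max_degree):
    """Values of p_0..p_max_degree on `support`, orthonormal under `probs`.

    Uses x*p_k = b_{k+1}*p_{k+1} + a_k*p_k + b_k*p_{k-1}.
    Requires max_degree < number of distinct support points.
    """
    def expect(values):
        return sum(p * v for p, v in zip(probs, values))

    prev = [0.0] * len(support)   # p_{-1}, identically zero
    curr = [1.0] * len(support)   # p_0, identically one
    polys = [curr]
    b = 0.0                       # b_0 is irrelevant because p_{-1} = 0
    for _ in range(max_degree):
        a = expect([x * c * c for x, c in zip(support, curr)])
        raw = [(x - a) * c - b * q for x, c, q in zip(support, curr, prev)]
        b = math.sqrt(expect([r * r for r in raw]))  # normalising constant
        nxt = [r / b for r in raw]
        polys.append(nxt)
        prev, curr = curr, nxt
    return polys

# Hypothetical four-point distribution.
xs = [0.0, 1.0, 2.0, 3.0]
ps = [0.1, 0.2, 0.3, 0.4]
for k, values in enumerate(orthonormal_polynomials(xs, ps, 2)):
    print(k, values)
```

Each new polynomial needs only the two preceding ones and two expectations, whereas a Gram–Schmidt construction projects against every earlier polynomial.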

    Estimates of the normalization factor of two representative samples ((A) sample 2, (B) sample 3) in the simulation study.

    Estimates are obtained by LMN (green solid line), MOD normalization (red dashed line) and MOD normalization on common targets (blue dotted line). The true normalization factor is represented by the horizontal line.

    Differential gene expression analysis of the multigene-expression signature for patients with neuroblastoma.

    Comparison of the significant (S) and non-significant (NS) differentially expressed genes (5% false discovery rate) by UCNR model (3) versus MNV+1. UCNR detects 5 additional differentially expressed genes.

    Q-Q plot of the −log₁₀ p-values for the UCNR model (3) versus MNV+1.

    The p-values result from the differential gene expression analysis of the multigene-expression signature for patients with neuroblastoma. The solid line represents the bisector. The UCNR method typically has larger −log₁₀ p-values than the MNV+1 method, resulting in a higher sensitivity.

    A differentially expressed microRNA (true δᵢ = −2) tracked during the simulation study.

    (A) Estimates of differential expression by UCNR (green solid line), multiple t-tests with MOD normalization and LOD imputation (red dashed line), MNV+1 imputation (blue dotted line) and KNN imputation (black dot-dashed line). A black circle (MNA group) or a grey square (MNSC group) on the horizontal axis marks each point at which an observation for this microRNA is censored. (B) Plot of −log₁₀ p-values for the hypothesis test (H₀: δᵢ = 0; H₁: δᵢ ≠ 0). (C) Box plot of differential expression estimates.